The speech synthesis detection algorithm based on cepstral coefficients and convolutional neural network
Abstract
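The existing approaches to detecting synthesized speech, based on the current issues of synthesizing voice sequences, are considered. The stages of the algorithm for detecting spoofing attacks on voice biometric systems are described, and its final workflow is presented. The research focuses mainly on synthesized speech, as it is the most dangerous type of attack. The authors designed a software application for an experimental study, present its structure, and propose an algorithm for detecting synthesized speech. This algorithm uses mel-frequency cepstral coefficients and constant Q cepstral coefficients to extract speech features. A Gaussian mixture model is used to construct the user model. A convolutional neural network was chosen as the classifier to determine the voice's authenticity. Two baseline methods for combating spoofing attacks, proposed by the ASVspoof2019 competition, were selected for making comparisons. One of these methods involved using linear frequency cepstral coefficients as speech features, while the other method used constant Q cepstral coefficients. Both solutions use Gaussian mixture models for classification. To evaluate the effectiveness of the proposed solution and compare it with the other methods, a database was created. The EER and minDCF metrics were applied. The results demonstrated the advantages of the proposed algorithm in comparison with the other algorithms. An advantage of the proposed solution is that the extracted speech features perform efficiently when it comes to voice identification. This makes it possible to use the proposed solution to optimize a voice biometric system that has embedded protection against spoofing attacks built on speech synthesis. In addition, the voice identification system requires only minimal modifications. Voice biometric systems have excellent opportunities in the banking sector. Such systems allow banks to simplify and accelerate the process of financial transactions and to provide their users with advanced banking functions remotely. The implementation of such systems is difficult because of their vulnerability to spoofing attacks, particularly those conducted by means of speech synthesis. The proposed solution can be integrated into voice biometric systems to improve their security.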
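The abstract names EER and minDCF as the evaluation metrics but does not give their formulas or parameters. The following is a minimal sketch of how these two metrics are commonly computed from detector scores, not the authors' implementation. It assumes that a higher score means "more likely genuine speech"; the function names and the cost parameters `p_target`, `c_miss`, and `c_fa` are hypothetical choices made for illustration.

```python
import numpy as np


def error_rates(genuine_scores, spoof_scores):
    """Return candidate thresholds with the matching FRR and FAR arrays.

    Assumption: a trial is accepted as genuine when score >= threshold.
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is included.
    thresholds = np.unique(np.concatenate([genuine, spoof]))
    thresholds = np.append(thresholds, thresholds[-1] + 1.0)
    # False rejection rate: genuine trials scored below the threshold.
    frr = np.array([(genuine < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    return thresholds, frr, far


def compute_eer(genuine_scores, spoof_scores):
    """Equal error rate: the point where FRR and FAR are closest."""
    _, frr, far = error_rates(genuine_scores, spoof_scores)
    idx = np.argmin(np.abs(frr - far))
    return (frr[idx] + far[idx]) / 2.0


def compute_min_dcf(genuine_scores, spoof_scores,
                    p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Minimum normalized detection cost over all thresholds.

    p_target, c_miss and c_fa are illustrative values, not taken
    from the paper.
    """
    _, frr, far = error_rates(genuine_scores, spoof_scores)
    dcf = c_miss * p_target * frr + c_fa * (1.0 - p_target) * far
    # Normalize by the cost of the best trivial accept-all/reject-all system.
    return dcf.min() / min(c_miss * p_target, c_fa * (1.0 - p_target))


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.7, 0.4]
    spoof = [0.6, 0.3, 0.2, 0.1]
    print("EER:", compute_eer(genuine, spoof))
    print("minDCF:", compute_min_dcf(genuine, spoof))
```

A lower value is better for both metrics. With perfectly separated score distributions, both functions return zero.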
Similar resources
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Double-Star Detection Using Convolutional Neural Network in Atmospheric Turbulence
In this paper, we investigate the usage of machine learning in the detection and recognition of double stars. To do this, numerous images including one star and double stars are simulated. Then, 100 terms of Zernike expansion with random coefficients are considered as aberrations to impose on the aforementioned images. Also, a telescope with a specific aperture is simulated. In this work, two k...
Noise-Robust Speech Features Based on Cepstral Time Coefficients
In this paper, we investigate the noise-robustness of features based on the cepstral time coefficients (CTC). By cepstral time coefficients, we mean the coefficients obtained from applying the discrete cosine transform to the commonly used mel-frequency cepstral coefficients (MFCC). Furthermore, we apply temporal filters used for computing delta and acceleration dynamic features to the CTC, res...
A Radon-based Convolutional Neural Network for Medical Image Retrieval
Image classification and retrieval systems have gained more attention because of easier access to high-tech medical imaging. However, the lack of availability of large-scaled balanced labelled data in medicine is still a challenge. Simplicity, practicality, efficiency, and effectiveness are the main targets in medical domain. To achieve these goals, Radon transformation, which is a well-known t...
Journal
Journal title: Naučno-tehničeskij Vestnik Informacionnyh Tehnologij, Mehaniki i Optiki
Year: 2021
ISSN: 2226-1494, 2500-0373
DOI: https://doi.org/10.17586/2226-1494-2021-21-4-545-552